Skip to content

Conversation

@aswinshakil
Copy link
Member

What changes were proposed in this pull request?

In #5339, We expanded the snapshot's deletedDirTable into fileTable to calculate the exclusive size for the files under the deleted directories. The problem with that approach is that if we alter the fileTable and dirTable of a snapshot, it would change the snapDiff result and yield undesirable results. For eg, Consider a deleted directory structure as below,

dir1 --- subdir1 --- file1 to file100
    | 
     ---- subdir2 --- file1 to file100

Instead of snapDiff saying DELETED dir1, It would say
DELETED subdir1
DELETED subdir2
DELETED subdir1/file1 to file100
DELETED subdir2/file1 to file100

In the new approach in this patch, We do an in-memory walk for each deleted directory and deep clean them, and also calculate the exclusive size for files when doing so. For each file, we do the same as #5301

The reason we are doing a deep clean, There could be files under a directory trapped in between snapshots. Consider this case,

Create dir1
Create snapshot1
Create 1000 files under dir1 --> These files are trapped and not deleted unless the snapshot itself is deleted.
Delete dir1
Create snapshot2

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9426

How was this patch tested?

Added Integration test.

@aswinshakil aswinshakil added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Nov 10, 2023
@aswinshakil aswinshakil self-assigned this Nov 10, 2023
Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @aswinshakil

Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall changes look good to me. Left some minor and cosmetic comments.

We can merge this once you resolve them.

try {
count = cluster.getOzoneManager().getMetadataManager()
.countRowsInTable(table);
LOG.info("{} actual row count={}, expectedCount={}", table.getName(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log statement is OK for debugging but not needed to be added.

Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating it @aswinshakil

LGTM except the seek and next as in comment: https://github.com/apache/ozone/pull/5579/files#r1413457354

Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@swamirishi swamirishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aswinshakil Thanks for the patch. I have a few major review comments after going through the PR so far. I haven't gone through the entire code and logic yet. These are some of the review comments you can address in the meanwhile I review the rest of the PR.

}
}

if (directoryIterator.hasNext()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this loop into the else condition. In one iteration we will either delete files in the directory or process the sub directory. Let us not do both. Then we wouldn't be requiring the next.next in the else loop condition. It makes the logic complicated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the current directory reference to add the next sub-dir to the stack. So it is simpler to do it in the same iteration than the next iteration.

// If the previous to previous snapshot doesn't
// have the key, then it is exclusive size for the
// previous snapshot.
if (keyInfoPrevToPrevSnapshot == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are supposed to look for the version diff, in case of object versioning. Exclusive size would be the size of the versions not found in the previous to previous and found in the previous snapshot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the version size being checked?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One version of the object would correspond to only one snapshot we need to check which version this is. It could be the case the key was modified and we are deleting previous versions or we are keeping all object versions.

Copy link
Contributor

@swamirishi swamirishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some suggestion. You may take it if it is helpful

stackTop.getDirValue().getObjectID(), "");
stackTop.setSubDirSeek(seekDirInDB);
} else {
// Adding \0 to seek the next greater element.
Copy link
Contributor

@swamirishi swamirishi Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hemantk-12 Can you confirm if adding "\0" is a correct approach. @aswinshakil it would be great to add this edge case in the unit test if it is not there. From what I know this should work since utf-8 encoding is orders strings lexicographically even in case of \0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me. As you said, we have to test it thoroughly.

Copy link
Contributor

@swamirishi swamirishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aswinshakil thanks for the patch. I have raised a follow up jira for fixing the exclusive size calculation for buckets with object versioning enabled. https://issues.apache.org/jira/browse/HDDS-10051

@aswinshakil aswinshakil merged commit d969689 into apache:master Jan 3, 2024
@aswinshakil
Copy link
Member Author

Thank you @hemantk-12 and @swamirishi for the reviews!

adoroszlai added a commit to adoroszlai/ozone that referenced this pull request Jan 22, 2024
adoroszlai added a commit that referenced this pull request Jan 22, 2024
…'s deleted directories. (#5579)" (#6051)

Reason for revert: incompatible proto changes

This reverts commit 0528494.
This reverts commit d969689.
Tejaskriya pushed a commit to Tejaskriya/ozone that referenced this pull request Jan 24, 2024
…'s deleted directories. (apache#5579)" (apache#6051)

Reason for revert: incompatible proto changes

This reverts commit 0528494.
This reverts commit d969689.
swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025
…shot's deleted directories. (apache#5579)

Change-Id: Ibea7794613f1702fb478fd70eeef7555255d697a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants